FRAMEWORK



The present paper is related to the development of cognitive approaches in test analysis.
Its main goal is to describe an approach that reveals the hierarchical test structure (HTS),
based on the cognitive demands of the test items, and conducts linear latent trait modeling
using the HTS elements as item difficulty components. To better define this approach, referred
to here as the Hierarchical Latent Trait Approach (HLTA), let me define in more detail the
concept of HTS in the sense in which it is used here.

Assuming the cognitive operations (CO) required by the test items have been established,
we say that a given (i-th) item is "inferior" to another (j-th) item only if the set of cognitive
operations required by the i-th item ($CO_i$) is a part of the set of cognitive operations required by
the j-th item ($CO_j$), i.e. $CO_i \subset CO_j$. Concomitantly, the j-th item is said to be "superior" to the
i-th item. In this sense, we say that the i-th and j-th items are (cognitively) related in hierarchical
order. The following algorithmic rule allows us to allocate all test items to hierarchically ordered
levels and, as a matter of fact, defines the hierarchical test structure (HTS):

HTS rule: "Level $L_k$ follows level $L_{k-1}$ in the increasing hierarchical order only if any item
allocated to level $L_k$ is "superior", as defined above, to at least one item allocated to level $L_{k-1}$."

Thus, the HTS is objectively determined by the above rule, and any (say, i-th) test item is
characterized by three components: $L_i$, the level to which the item was allocated; $I_i$, the number of
items "inferior" to the item; and $S_i$, the number of items "superior" to the item. The HTS can also be
thought of as a "tree-type" graph diagram where all points (items) are located at hierarchically
ordered levels and an "arrow" goes from a given (i-th) point to another (j-th) point only if the
item represented by the j-th point is "superior" to the item represented by the i-th point. In this
sense, $S_i$ is the number of "arrows" that leave the i-th item and $I_i$ is the number of
"arrows" that enter the i-th item in the HTS graph. Obviously, I = 0 for all items located at the
lowest HTS level and S = 0 for all items located at the highest HTS level (see fig. 1).
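
The level allocation can be sketched in a few lines of Python. This is only one reading of the HTS rule, under the assumption that an item with no "inferior" items sits at level 1 and every other item sits one level above the highest of its "inferior" items. The function name `allocate_levels` and the example CO sets are hypothetical and are not taken from the test data.

```python
def allocate_levels(co_sets):
    """Map each item to an HTS level, given {item: set of cognitive operations}."""
    levels = {}

    def level(item):
        if item not in levels:
            # Items whose CO set is a proper subset of this item's CO set
            # are "inferior" to it.
            inferior = [j for j in co_sets if co_sets[j] < co_sets[item]]
            # Assumption: one level above the highest "inferior" item,
            # or level 1 when there is none.
            levels[item] = 1 + max((level(j) for j in inferior), default=0)
        return levels[item]

    for item in co_sets:
        level(item)
    return levels


# Hypothetical items; these CO sets are illustrative only.
example = {
    "a": {1},
    "b": {1, 2},
    "c": {1, 2, 5},
    "d": {3},
    "e": {1, 2, 5},
}
levels = allocate_levels(example)
```

Items "c" and "e" require the same operations, so neither is "inferior" to the other and both land on the same level, as with items 18 and 19 of the midterm test.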

The next step in the description of the Hierarchical Latent Trait Approach (HLTA) is
related to the use of the (L,I,S) cognitive information components, as defined above, for
predicting and analyzing the difficulty parameter of the test items. The model
used here is a linear logistic latent trait model (Fischer, 1973; Embretson, 1984):

(1)   $P(x_{ij} = 1 \mid \theta_j, \eta_k, a) = \dfrac{\exp[\theta_j - (\sum_k c_{ik}\eta_k + a)]}{1 + \exp[\theta_j - (\sum_k c_{ik}\eta_k + a)]}$

where: $x_{ij}$ = the response of person j to item i;
       $\theta_j$ = ability for person j;
       $\eta_k$ = the difficulty of complexity factor k (part of the total item difficulty);
       $c_{ik}$ = the coefficient of $\eta_k$ in the linear representation of the item difficulty by
                 the set of complexity factors $\eta_1, \eta_2, \ldots, \eta_m$;
       a = a normalization constant.

In particular, the HLTA idea is to use the (L,I,S) cognitive information components as
complexity factors in model (1), i.e.

$\eta_1$ = L (the item level in the HTS); the respective coefficient $c_{i1}$ is the numerical order
       of the level in the hierarchical test structure (HTS).
$\eta_2$ = I (the "input" cognitive information of the item); the coefficient $c_{i2}$ is the
       number of items "inferior" to the i-th item.
$\eta_3$ = S (the "output" cognitive information of the item); the coefficient $c_{i3}$ is the
       number of items "superior" to the i-th item.
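
A minimal sketch of model (1) with the three complexity components follows. The function names and all numeric values are hypothetical, chosen only to show how the coefficients (L, I, S) and the factor difficulties combine; they are not estimates from the test data.

```python
import math


def item_difficulty(c, eta, a):
    """Linear item difficulty: sum over k of c[k] * eta[k], plus the constant a."""
    return sum(c_k * eta_k for c_k, eta_k in zip(c, eta)) + a


def p_correct(theta, c, eta, a):
    """Probability of a correct response under model (1)."""
    z = theta - item_difficulty(c, eta, a)
    # exp(z) / (1 + exp(z)) written in the equivalent form 1 / (1 + exp(-z)).
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values for illustration only.
eta_demo = (0.5, 0.1, 0.05)   # difficulties of the L, I, and S factors
a_demo = -1.0                 # normalization constant
c_demo = (2, 1, 3)            # an item with L = 2, I = 1, S = 3
b_demo = item_difficulty(c_demo, eta_demo, a_demo)
p_equal = p_correct(b_demo, c_demo, eta_demo, a_demo)
```

When a person's ability equals the item difficulty, the model gives a probability of one half, and the probability rises with ability.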

The item difficulties calculated by using model (1) with (L,I,S) complexity components
will be referred to here as HLTA item difficulties. A Fortran program, LINLOG (kindly
provided by Dr. Susan Embretson of the University of Kansas), was used for the calculation
of the $\eta_k$ values, the HLTA item difficulties, and the regression of the Rasch item difficulties on
the HLTA item difficulties.

The HLTA as presently applied also includes some other validation procedures, such as
multiple regression analysis with (L, I, S) as predictors of the item difficulty, and cluster analysis,
described below in more detail.

METHODS: 

Data source: Two tests (midterm and final) in a statistics course for undergraduate
students at SIUC, Spring 1994.
Student sample size: 47 for the midterm test and 49 for the final test.
Item sample size: 20 items on the midterm test and 24 items on the final test.
Item domain: reproduced (with adaptations) from the set of examples and exercises in the
course textbook (Moore & McCabe, 1993).

PROCEDURES 

1. Determination of cognitive operations required by the test items: 

On the basis of expert analysis, nine cognitive operations (CO) were inferred from the
process of solving a set of items validated for consistency with domain parameters, educational
goals, etc. These operations were defined at a level of generality appropriate for the type
of items. In the CO description given below, CRP stands for "Concept, Rule, or Principle":

$CO_1$: Straight CRP identification from a set of given options.

$CO_2$: CRP identification based on straight inference from a verbal interpretation.

$CO_3$: CRP identification based on inference from implicit contextual information.

$CO_4$: Usage of CRP for straight inference, justification, or explanation of a statement, decision,
        situation, or phenomenon.

$CO_5$: Straight application of a simple routine procedure (usage of a statistical formula, table,
        histogram, diagram, etc.).

$CO_6$: Solving familiar algorithm-type problems (outlining the design of familiar types of
        experiments, testing familiar types of hypotheses about means, proportions, etc.).

$CO_7$: Solving familiar non-algorithm-type problems (usage of familiar and practiced
        procedures, after appropriate interpretation and/or classification processing).

$CO_8$: Solving unfamiliar "jointing-type" problems (selection of appropriate rules, procedures,
        etc., and "jointing" them in a solving method on the basis of logical relations).

$CO_9$: Solving unfamiliar "analysis-type" problems (solving enhanced $CO_8$-type problems,
        discerning patterns and/or tendencies, evaluation processing, etc.).

The results of the analysis showing which of the cognitive operations $CO_1, \ldots, CO_9$ are
required by the respective test items can be summarized in a two-way table with 1 in the (i, j) cell
if the cognitive operation $CO_j$ is required by the i-th item (otherwise, 0 in the cell). This table,
referred to here as "I-CO" (Items by Cognitive Operations), facilitates the construction of the HTS
matrix defined in the next procedure.

2. Determination of the HTS (Hierarchical Test Structure): 

The first step in this procedure is to make a two-way table (referred to here as the HTS matrix)
with N rows and N columns, where N is the number of test items. We put a value of 1 (one) in the
cell (i, j) if the set of cognitive operations (CO) required by the j-th item is a part of the
set of cognitive operations (CO) required by the i-th item, i.e. the i-th item is "superior" to the j-th
item in the sense defined at the beginning. Otherwise we leave the (i, j) cell empty (i, j = 1, 2, ...,
N). In other words, we put 1 in the cell at the "intersection" of the i-th row and j-th column of the HTS
matrix only if the i-th item requires all cognitive operations required by the j-th item plus at least
one more cognitive operation. The HTS matrix for the midterm test items (from the data source
used in the present study) is illustrated by Table 1-M.
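
The construction of the HTS matrix from the I-CO table can be sketched as follows. This is an illustration under stated assumptions: the function name `hts_matrix` and the small I-CO table are hypothetical, items and operations are numbered from 0, and empty cells are written as 0.

```python
def hts_matrix(ico):
    """Build the HTS matrix from an I-CO table.

    ico[i][k] is 1 if item i requires cognitive operation k, else 0.
    """
    n = len(ico)
    required = [{k for k, flag in enumerate(row) if flag} for row in ico]
    # Cell (i, j) is 1 when item j's operations are a proper subset of item i's,
    # i.e. item i needs everything item j needs plus at least one more operation.
    return [[1 if required[j] < required[i] else 0 for j in range(n)]
            for i in range(n)]


# Hypothetical I-CO table with four items and three operations.
ico = [
    [1, 0, 0],  # item 0 requires operation 0
    [1, 1, 0],  # item 1 requires operations 0 and 1
    [1, 1, 1],  # item 2 requires operations 0, 1, and 2
    [0, 0, 1],  # item 3 requires operation 2
]
matrix = hts_matrix(ico)
i_counts = [sum(row) for row in matrix]         # row sums: items "inferior" to item i
s_counts = [sum(col) for col in zip(*matrix)]   # column sums: items "superior" to item j
```

The row sums and column sums of the matrix are the I and S values of the items.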



Insert Table 1-M about here 



Similarly, Table 1-F is the HTS matrix for the final test items. 



Insert Table 1-F about here 

The earlier defined HTS rule, when applied to the HTS matrix, determines the HTS
(Hierarchical Test Structure). Fig. 1 illustrates the HTS inferred from Table 1-M and represented
in the form of a "tree-type" graph with arrows connecting any two items that are in a hierarchical
cognitive dependence (not all the arrows are given in fig. 1, but they can be directly
"restored" from the HTS matrix). It can be seen that the HTS for the midterm test contains five
hierarchically ordered levels, from the lowest ($L_1$) to the highest ($L_5$) level.



Insert fig. 1 about here 



3. Linear latent trait modeling with (L, I, S) complexity components: 

In order to use the linear latent trait model (1) with complexity components $\eta_1$ = L, $\eta_2$ = I,
and $\eta_3$ = S, we have to determine the respective coefficients $c_{i1}$, $c_{i2}$, and $c_{i3}$ for all items (i = 1, 2, ...,
N). This is done by using the HTS information. For example, from fig. 1 we can see that
item #4 is located at the fourth level ($L_4$), which means that L = 4 for this item, i.e. $c_{41}$ = 4.
Further, we can see four arrows ending in item #4 (or, equivalently, four 1's in the fourth row of
Table 1-M). This means that there are four items "inferior" to item #4 and I = 4 for this item,
i.e. $c_{42}$ = 4. Finally, there are two arrows starting from item #4 (or, equivalently, two 1's in the
fourth column of Table 1-M). This means that there are two items "superior" to item #4 and S =
2 for this item, i.e. $c_{43}$ = 2.

With the (L,I,S) coefficient information for all items, we use the LINLOG program for the
calculation of $\eta_1$, $\eta_2$, $\eta_3$, the HLTA item difficulties, and the Rasch item difficulties. The data fit
to the Rasch model was tested with the MICROSCALE program (Wright & Linacre, 1984).

4. Multiple regression analysis for (L, I, S) prediction of the item difficulty: 

The dependent variable in this multiple regression analysis is the Rasch item difficulty,
predicted by the complexity components L, I, and S with the values determined in Procedure 3. Of
primary interest here is the overall contribution of the three predictors to the variability of the
Rasch item difficulties, as well as the partial contribution of each predictor to this variability.
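
The regression step can be sketched with an ordinary least squares fit in plain Python. This is not the statistical package used in the study: the helper names are hypothetical, and the (L, I, S, Rasch difficulty) rows are transcribed from Table 2-M, so the sketch is not guaranteed to reproduce the reported R^2 values exactly.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of a least squares fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)  # normal equations
    fitted = [sum(bk * v for bk, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# (L, I, S, Rasch difficulty) for the 20 midterm items, as listed in Table 2-M.
midterm = [
    (1, 0, 0, -2.36), (1, 0, 14, -1.53), (1, 0, 5, -1.53), (1, 0, 8, -1.31),
    (1, 0, 8, -1.31), (2, 1, 6, -0.99), (2, 2, 2, -0.89), (2, 3, 5, -0.69),
    (2, 3, 6, -0.69), (2, 1, 4, -0.38), (2, 1, 8, 0.03), (3, 2, 5, 0.14),
    (3, 6, 3, 0.73), (3, 6, 3, 0.73), (3, 6, 3, 0.87), (4, 4, 2, 0.61),
    (4, 11, 0, 1.67), (4, 6, 2, 2.12), (5, 16, 0, 2.39), (5, 16, 0, 2.39),
]
rasch = [row[3] for row in midterm]
r2_level = r_squared([row[:1] for row in midterm], rasch)  # L only
r2_full = r_squared([row[:3] for row in midterm], rasch)   # L, I, and S
```

The difference `r2_full - r2_level` is the joint "over and above" contribution of I and S once L is in the model.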

5. Item clustering: 

The item clustering is used for an additional HTS validation. The starting matrix $\|x_{ij}\|$ in
this procedure has n rows (n students) and N columns (N items), with $x_{ij}$ = 1 if the i-th student
answered the j-th item correctly and $x_{ij}$ = 0 otherwise. The inferred similarity matrix $\|s_{jk}\|$ has N rows
and N columns, with $s_{jk}$ representing the level of similarity between the j-th and k-th items, i.e. the
proportion of all n values (1 or 0) with the same position number in the j-th and k-th columns of the
matrix $\|x_{ij}\|$ that are equal. Then, on the basis of the similarity matrix $\|s_{jk}\|$, all test items are grouped
in clusters with respective levels of similarity. Technically, one can use different cluster programs
(in SAS, SPSS, SYSTAT, etc.) with $\|x_{ij}\|$ as the input data matrix.
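
The similarity matrix described above can be sketched as follows. The function name and the small response matrix are hypothetical, and the clustering step itself is left to a statistical package.

```python
def similarity_matrix(responses):
    """Item-by-item similarity from a students-by-items 0/1 response matrix.

    responses[i][j] is 1 if student i answered item j correctly, else 0.
    """
    n_students = len(responses)
    columns = list(zip(*responses))  # one tuple of responses per item
    # Proportion of students whose responses to items j and k agree.
    return [
        [sum(u == v for u, v in zip(col_j, col_k)) / n_students
         for col_k in columns]
        for col_j in columns
    ]


# Hypothetical responses of four students to three items.
responses = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
]
sim = similarity_matrix(responses)
```

A similarity of 1.00 between two items means that every student gave matching responses (both correct or both incorrect) to the two items.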

RESULTS 

Table 2-M is a table-type representation of the HTS (Hierarchical Test Structure) for the
midterm test from our data source. It is a different way to represent the information from
Table 1-M and fig. 1. The difference is that Table 2-M contains the Rasch item difficulties but does
not show to which items any given item is "inferior" or "superior", i.e. the HTS "arrows" are
missing in Table 2-M.



Insert Table 2-M about here 



In Table 2-M the Rasch difficulty of any item is given in parentheses following the
number of the item. The I and S values, given in the second row of each cell, show the number of
items "inferior" (resp. "superior") to the corresponding item in the cell. One can see that items
1, 10, 9, and 20 are located at the lowest HTS level (L = 1) and their Rasch difficulties vary from
-1.53 to -1.31 on the logit scale. Items 3, 17, 11, 15, 6, and 13 are located at the second HTS level
(L = 2) and their Rasch difficulties vary from -.99 to .03. Items 5, 12, 16, and 7 are located at the
third HTS level (L = 3) with Rasch difficulties varying from .14 to .87. Items 4, 8, and 14 are
located at the fourth HTS level with difficulties ranging from .61 to 2.12. Finally, items 18
and 19 are located at the highest HTS level (L = 5) with the Rasch item difficulty of this level
reduced to a single value of 2.39. The explanation of this result is that items 18 and 19 require the
same cognitive operations. One can also see that the Rasch item difficulties increase with the
increase of the HTS level. The difficulty intervals taken successively from all HTS levels "cover"
virtually the total range from the lowest (-1.53) to the highest (2.39) Rasch item difficulty.

The results from Table 2-F concern the other ("final") test, but they should be interpreted in
exactly the same way as those from Table 2-M. One can see the same behavior of the relation
"Rasch item difficulties - HTS levels" as was the case in Table 2-M.



Insert Table 2-F about here 

According to model (1), the HLTA difficulty $b_i$ of some (i-th) item is calculated as follows:

(2)   $b_i = c_{i1}\eta_1 + c_{i2}\eta_2 + c_{i3}\eta_3 + a$

For the midterm test we obtained: $\eta_1$ = .8738323, $\eta_2$ = .0792703, $\eta_3$ = .0534297, and
normalization constant a = -2.78561. On the other hand, from the HTS we know the L, I, and S
values of the i-th item, i.e. we know $c_{i1}$, $c_{i2}$, and $c_{i3}$, respectively. Hence, the calculation of the
HLTA item difficulties $b_i$ is simple enough when using equation (2). Still for the midterm test,
the correlation between the HLTA difficulties and the Rasch difficulties of the respective 20 items
was found to be R = .963. This extremely high correlation is consistent with the $R^2$ value that
was found from the multiple regression analysis used for the prediction of the Rasch difficulties by
the complexity components L, I, and S: $R^2$ = .9305. The forward stepwise regression analysis
showed that most of this high prediction is due to the L-component (with $R^2$ = .9167). The
partial $R^2$ (the "over and above" prediction) was found to be .0099 and .0039 for the other two
components, I and S, respectively.
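
As a worked instance of equation (2), the HLTA difficulty of midterm item #4 can be computed from the coefficients read off the HTS in Procedure 3 (L = 4, I = 4, S = 2) and the midterm estimates given above. The arithmetic is only an illustration, rounded to three decimals at the end:

```latex
\begin{aligned}
b_4 &= 4(.8738323) + 4(.0792703) + 2(.0534297) - 2.78561 \\
    &= 3.4953292 + 0.3170812 + 0.1068594 - 2.78561 \\
    &\approx 1.134
\end{aligned}
```

For comparison, Table 2-M lists a Rasch difficulty of .61 for this item.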

Similarly, for the final test we obtained: $\eta_1$ = 072252, $\eta_2$ = .061925, $\eta_3$ = -.105452, and
normalization constant a = .01009. The respective HLTA difficulties, calculated on the basis of
equation (2), correlated highly with the Rasch difficulties for the respective 24 items: R = .948.
As in the previous case, this result was consistent with the $R^2$ value found from the multiple
regression analysis for the prediction of Rasch difficulties by the complexity components L, I, and
S: $R^2$ = .899. Here, again, the most important prediction factor was the L-component with its
$R^2$ = .85. The "over and above" contribution of the other two components, S and I, was .0274
and .0212, respectively.

The result from the item clustering (Procedure 5) is illustrated in fig. 2 for the midterm 
test. There are four clusters of items at the highest level of similarity (1.00). If we take the items 
of any of these clusters we will see that they are located at the same HTS level (Table 2-M) and 
their difficulties are the same or very close. For example, the items of the cluster {#18, #19} are 
located at the same HTS level (L=5) and they have the same difficulty (2.39). The items of the 
cluster {#7, #16} are located at the same HTS level (L=3) and their difficulties are .87 and .73, 
respectively. The items of the cluster {#9, #20} are located at the lowest HTS level (L=1) and
have the same difficulty (-1.31). Finally, the items of the cluster {#11, #15} are located at the
same HTS level (L=2) and have the same difficulty (-.69). The fifth cluster includes, at a very
high level of similarity (.88), all items except the two most difficult items {#18, #19} and the
easiest item #2. The last cluster combines the fifth cluster and item #2 at a similarity level of .77,
which is still very high.



Insert fig. 2 about here 

The above interpretation of fig. 2 suggests a high similarity of the item response patterns.
This can be explained by the relatively high homogeneity of the student samples, which is typical
for most university courses. Further, the clustering of the items is consistent with their allocation
to the HTS levels and with their difficulties, which is important for the HLTA validation.

DISCUSSION 

The Hierarchical Latent Trait Approach (HLTA), as presently defined and illustrated,
provides both quantitative and qualitative information about parameters and relations of main
interest in test analysis. The "new thing" with the HLTA is the HTS cognitive model with four
characteristics of the test items: L (HTS level), I ("input" cognitive information), S ("output"
cognitive information), and (I-S)R ("inferiority"/"superiority" relations) given, for example, by
"arrows" between some items in the HTS graph representation (see fig. 1).

The terminology of "hierarchically ordered levels of test items" is used in some cognitive
models for test analysis, but in a sense quite different from that of the HTS model. For example,
Gitomer & Rock (1993) define the following three hierarchical levels: Level 1 = Task recognition
+ application of simple rules; Level 2 = Insight + application of simple rules; Level 3 = Insight +
production + application of simple rules. In this type of model the items are categorized in a
fixed number of levels on the basis of subjective expert judgments, whereas the HTS rule
objectively infers the number of levels and the item allocation from the HTS matrix (as defined at
the beginning). Moreover, the HTS model provides the above-mentioned item characteristics I, S,
and (I-S)R, which are not inherent in the other models. In this sense, the HTS model is unique
in its cognitive characteristics and provides quantitative coefficients of L, I, and S that make
possible their use as complexity components in the linear latent trait model (1). Although in both
midterm and final tests the I and S components were almost negligible in comparison with the
L-component for the prediction of the item difficulties, we still recommend their use in the HLTA,
expecting that they may play a more significant role in other types of testing situations.

The HLTA validation starts with expert-based judgments about the appropriateness of
the cognitive operations in the context of content domain, testing purposes, etc. In this particular
HLTA illustration we defined nine cognitive operations ($CO_1, \ldots, CO_9$) on the basis of which the
HTS was objectively determined. Further, the LINLOG program tests in two different ways the
adequacy of the complexity components L, I, and S, inferred from the HTS, for the linear latent
trait modeling: (A) a log likelihood $\chi^2$ test for differences in goodness of fit between model (1)
and the Rasch model; (B) the Pearson correlation between the Rasch item difficulties and the item
difficulties estimated by model (1). What we obtained, for example, from the midterm test data
was, respectively: (A) $\chi^2$ = 18.51 with df = 17 and (B) R = .963. From the final test data we
obtained: (A) $\chi^2$ = 69.35 with df = 21 and (B) R = .847. Interpreting these results, we can say
that the $\chi^2$ value, for both midterm and final tests, shows no significant difference in goodness of
fit between model (1) and the Rasch model. At the same time, the correlation coefficient R shows
extremely good prediction of the Rasch item difficulties from the complexity components L, I,
and S for both the midterm and final tests.

Despite the relatively small student sample sizes, we used the Rasch item difficulties for
the purposes of the HLTA validation, relying on the replication of the HLTA with midterm and
final tests. In both cases the data fit the Rasch model according to the "outfit/infit" rules (Wright
& Linacre, 1984, Chapter 4, pp. 1-34). Moreover, for both midterm and final tests we obtained
an additional HLTA validation from the consistency of the results across all HLTA procedures:
HTS development, linear latent trait modeling, multiple regression analysis, and cluster analysis.

In conclusion we can say that the results from the HLTA procedures provide different 
pieces of information for making diagnostic decisions about item characteristics, students' 
abilities and cognitive processes required to solve problems within a test. As presently applied, 
the HLTA is related to the testing of achievement on cognitive items. However, if, instead of 
cognitive operations, some characteristics influencing the score on personality items are defined, 
it can be used for obtaining both quantitative and qualitative information from a personality test. 

Table 1-M

HTS matrix for the midterm test: (i, j) = 1 means that the i-th item is "superior" to the j-th item
I = number of items "inferior" to the i-th item; S = number of items "superior" to the j-th item

Table 1-F

HTS matrix for the final test: (i, j) = 1 means that the i-th item is "superior" to the j-th item
I = number of items "inferior" to the i-th item; S = number of items "superior" to the j-th item

[Matrix cells not reproduced; only marginal totals are legible.]

Row totals (I): item 17: 8; item 18: 13; item 19: 5; item 20: 0; item 21: 6; item 22: 14; item 23: 14; item 24: 18
Column totals (S), items 1 to 24: 13, 13, 11, 13, 13, 9, 4, 2, 5, 13, 7, 9, 10, 1, 0, 9, 5, 5, 3, 13, 7, 1, 1, 1

Table 2-M

Table form of the HTS for the midterm test (with the Rasch difficulties given in parentheses)

HTS level   Item #           Item #           Item #           Item #           Item #           Item #
L = 1       #2 (-2.36)       #1 (-1.53)       #10 (-1.53)      #9 (-1.31)       #20 (-1.31)
            I=0; S=0         I=0; S=14        I=0; S=5         I=0; S=8         I=0; S=8
L = 2       #3 (-.99)        #17 (-.89)       #11 (-.69)       #15 (-.69)       #6 (-.38)        #13 (.03)
            I=1; S=6         I=2; S=2         I=3; S=5         I=3; S=6         I=1; S=4         I=1; S=8
L = 3       #5 (.14)         #12 (.73)        #16 (.73)        #7 (.87)
            I=2; S=5         I=6; S=3         I=6; S=3         I=6; S=3
L = 4       #4 (.61)         #8 (1.67)        #14 (2.12)
            I=4; S=2         I=11; S=0        I=6; S=2
L = 5       #18 (2.39)       #19 (2.39)
            I=16; S=0        I=16; S=0

Table 2-F

Table form of the HTS for the final test (with Rasch difficulties given in parentheses)

Level     Item #         Item #         Item #         Item #         Item #         Item #         Item #         Item #
L = 1     #1 (-2.06)     #10 (-1.9)     #20 (-1.9)     #4 (-1.6)      #13 (-1.3)     #3 (-.86)      #5 (-.65)      #2 (-.55)
          [illegible]    [illegible]    [illegible]    [illegible]    [illegible]    [illegible]    [illegible]    [illegible]
L = 2     #6 (-.25)      #21 (.03)
          [illegible]    [illegible]
L = 3     #11 (.03)      #12 (.12)      #17 (.12)
          [illegible]    [illegible]    [illegible]
L = 4     #6 (.22)
          [illegible]
L = 5     #7 (.32)       #9 (.32)
          I=10; S=4      I=11; S=7
L = 6     #19 (.41)      #8 (.61)
          I=5; S=3       I=11; S=2
L = 7     #18 (.83)
          I=13; S=5
L = 8     #14 (1.18)     #23 (1.44)     #22 (1.58)
          I=7; S=1       I=14; S=1      I=14; S=1
L = 9     #24 (1.73)
          I=18; S=1
L = 10    #15 (2.08)
          I=23; S=0

[illegible]: I and S values not legible in the source.